How informative are spatial CA3 representations established by the dentate gyrus? 
In the mammalian hippocampus, the dentate gyrus (DG) is characterized by sparse and powerful 
unidirectional projections to CA3 pyramidal cells, the so-called mossy fibers. Mossy fiber synapses 
appear to duplicate, in terms of the information they convey, what CA3 cells already receive from 
entorhinal cortex layer II cells, which project both to the dentate gyrus and to CA3. Computational 
models of episodic memory have hypothesized that the function of the mossy fibers is to enforce a 
new, well separated pattern of activity onto CA3 cells, to represent a new memory, prevailing over the 
interference produced by the traces of older memories already stored on CA3 recurrent collateral 
connections. Can this hypothesis apply also to spatial representations, as described by recent 
neurophysiological recordings in rats? To address this issue quantitatively, we estimate the amount 
of information DG can impart on a new CA3 pattern of spatial activity, using both mathematical 
analysis and computer simulations of a simplified model. We confirm that, also in the spatial case, 
the observed sparse connectivity and level of activity are most appropriate for driving memory 
storage - and not to initiate retrieval. Surprisingly, the model also indicates that even when DG 
codes just for space, much of the information it passes on to CA3 acquires a non-spatial and episodic 
character, akin to that of a random number generator. It is suggested that further hippocampal 
processing is required to make full spatial use of DG inputs. 




I. INTRODUCTION 

The hippocampus presents the same organization 
across mammals, and distinct ones in reptiles and in 
birds. A most prominent and intriguing feature of the 
mammalian hippocampus is the dentate gyrus (DG). As 
reviewed in [Sy] , the dentate gyrus is positioned as a sort 
of intermediate station in the information flow between 
the entorhinal cortex and the CA3 region of the hip- 
pocampus proper. Since CA3 receives also direct, per- 
forant path connections from entorhinal cortex, the DG 
inputs to CA3, called mossy fibers, appear to essentially 
duplicate the information that CA3 can already receive 
directly from the source. What may be the function of 
such a duplication? 

Within the view that the recurrent CA3 network oper- 
ates as an autoassociative memory [28], [35], it has been 
suggested that the mossy fibers (MF) inputs are those 
that drive the storage of new representations, whereas the 
perforant path (PP) inputs relay the cue that initiates the 
retrieval of a previously stored representation, through 
attractor dynamics, due largely to recurrent connections 
(RC). Such a proposal is supported by a mathematical 
model which allows a rough estimate of the amount of 
information, in bits, that different inputs may impart to 
a new CA3 representation J4a | . That model, however, is 
formulated in the Marr [25] framework of discrete mem- 
ory states, each of which is represented by a single activ- 
ity configuration or firing pattern. 

Conversely, the prediction that MF inputs may be im- 
portant for storage and not for retrieval has received ten- 
tative experimental support from experiments with spa- 
tial tasks, either the Morris water maze [20] or a dry 
maze [21]. Two-dimensional spatial representations, to 
be compatible with the attractor dynamics scenario, re- 
quire a multiplicity of memory states, which approximate 
a 2D continuous manifold, isomorphic to the spatial en- 
vironment to be represented. Moreover, there has to be 
of course a multiplicity of manifolds, to represent distinct 
environments with complete remapping from one to the 
other [23]. Attractor dynamics then occurs along the di- 
mensions locally orthogonal to each manifold, as in the 
simplified "multi-chart" model 0], whereas tangen- 
tially one expects marginal stability, allowing for small 
signals related to the movement of the animal, reflect- 
ing changing sensory cues as well as path integration, to 
displace a "bump" of activity on the manifold, as appro- 
priate H, 0- 

Although the notion of a really continuous attractor 
manifold appears as a limit case, which can only be ap- 
proximated by a network of finite size [47], [15], [33], [3(|, 
even the limit case raises the issue of how a 2D attractor 
manifold can be established. In the rodent hippocampus, 
the above theoretical suggestion and partial experimental 
evidence point at a dominant role of the dentate gyrus, 
but it has remained unclear how the dentate gyrus, with 
its MF projections to CA3, can drive the establishment 
not just of a discrete pattern of activity, as envisaged by 
[III, but of an entire spatial representation, in its full 
2D glory. This paper reports the analysis of a simplified 
mathematical model aimed at addressing this issue in a 
quantitative, information theoretical fashion. 

Such an analysis would have been difficult even only 
a few years ago, before the experimental discoveries that 
largely clarified, in the rodent, the nature of the spatial 
representations in the regions that feed into CA3. First, 
roughly half of the entorhinal PP inputs, those coming 
from layer II of the medial portion of entorhinal cortex, 
were found to be in the form of grid cells, i.e. units 
that are activated when the animal is in one of multiple 
regions, arranged on a regular triangular grid [14]. Sec- 
ond, the sparse activity earlier described in DG granule 
cells [16] was found to be concentrated on cells also with 
multiple fields, but irregularly arranged in the environ- 
ment [22]. These discoveries can now inform a simpli- 
fied mathematical model, which would have earlier been 
based on ill-defined assumptions. Third, over the last 
decade neurogenesis in the adult dentate gyrus has been 
established as a quantitatively constrained but still sig- 
nificant phenomenon, stimulating novel ideas about its 
functional role 0]. The first and third of these phenom- 
ena will be considered in extended versions of our model, 
to be analysed elsewhere; here, we focus on the role of 
the multiple DG place fields in establishing novel CA3 
representations. 

A. A simplified mathematical model 

The complete model considers the firing rate of a CA3 
pyramidal cell, $\eta_i$, to be determined by the firing rates 
$\{\eta\}$ of other cells in CA3, which influence it through RC 
connections; by the firing rates $\{\beta\}$ of DG granule cells, 
which feed into it through MF connections; by the firing 
rates $\{\varphi\}$ of layer II pyramidal cells in entorhinal cor- 
tex (medial and lateral), which project to CA3 through 
PP axons; and by various feedforward and feedback in- 
hibitory units. A most important simplification is that 
the fine temporal dynamics, e.g. on theta and gamma 
time scales, is neglected altogether, so that with "firing 
rate" we mean an average over a time of order the theta 
period. Information coding over shorter time scales re- 
quires a more complex analysis, which is left to future 
refinements of the model. 

For the different systems of connections, we assume 
the existence of anatomical synapses between any 
two cells to be represented by fixed binary matrices 
$\{c^{PP}\}$, $\{c^{MF}\}$, $\{c^{RC}\}$ taking 0 or 1 values, whereas the 
efficacy of those synapses is described by matrices 
$\{J^{PP}\}$, $\{J^{MF}\}$, $\{J^{RC}\}$. The effect of inhibition and of 
the current threshold for activating a cell are summarized 
into a subtractive term, of which we denote with $T$ the 
mean value across CA3 cells, and with $\delta_i$ the deviation 
from the mean for a particular cell $i$. 

Assuming finally a simple threshold-linear activation 
function [42] for the relation between the activating 
current and the output firing rate, we write 

$$\eta_i(\vec{x}) = g \Big[ \sum_l c^{RC}_{il} J^{RC}_{il} \eta_l(\vec{x}) + \sum_j c^{MF}_{ij} J^{MF}_{ij} \beta_j(\vec{x}) + \sum_k c^{PP}_{ik} J^{PP}_{ik} \varphi_k(\vec{x}) - T - \delta_i \Big]^+ \qquad (1)$$

where $[\cdot]^+$ indicates taking the sum inside the brackets 
if positive in value, and zero if negative, and $g$ is a gain 
factor. The firing rates of the various populations are all 
assumed to depend on the position $\vec{x}$ of the animal, and 
the notation is chosen to minimize differences with our 
previous analyses of other components of the hippocampal 
system (e.g. [Hj], (3). 



B. The storage of a new representation 

When the animal is exposed to a new environment, we 
make the drastic modelling assumption that the new CA3 
representation be driven solely by MF inputs, while PP 
and RC inputs provide interfering information, reflecting 
the storage of previous representations on those synaptic 
systems, i.e., noise. We reabsorb the mean of such noise 
into the mean of the "threshold+inhibition" term $T$ and 
similarly for the deviation from the mean, writing 

$$\eta_i(\vec{x}) = \Big[ \sum_j c^{MF}_{ij} J^{MF}_{ij} \beta_j(\vec{x}) - \delta_i - T \Big]^+ \qquad (2)$$

where the gain has been set to $g = 1$, without loss of 
generality, by an appropriate choice of the units in which to 
measure $\{c^{MF}\}$, $\{J^{MF}\}$ (pure numbers) and $\delta_i$, $T$ (s$^{-1}$). 
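
As an illustration of the threshold-linear rule in Eq. (2), the following minimal Python sketch computes the rate of one CA3 unit at a single position. The function name `ca3_rate`, the numerical values, and the use of a single weight `J` for all MF synapses are assumptions made here for brevity, not part of the model specification.

```python
def ca3_rate(dg_rates, connected, J, T, delta_i):
    """Threshold-linear CA3 rate in the spirit of Eq. (2), with one weight J.

    dg_rates  : firing rates beta_j(x) of the DG units at the current position
    connected : 0/1 flags c_ij marking which DG units project to this CA3 unit
    T, delta_i: mean threshold and unit-specific deviation (noise)
    """
    drive = J * sum(c * b for c, b in zip(connected, dg_rates))
    # Negative activations are clipped to zero, as the [.]^+ brackets prescribe.
    return max(0.0, drive - delta_i - T)


# Hypothetical numbers, chosen only for illustration.
rates = [0.0, 2.0, 1.5, 0.0]
wiring = [1, 1, 0, 1]          # the third DG unit is not connected
above = ca3_rate(rates, wiring, J=1.0, T=1.2, delta_i=0.3)
below = ca3_rate(rates, wiring, J=1.0, T=2.5, delta_i=0.3)
print(above, below)
```

Only the connected DG units contribute to the drive, and a unit whose drive does not exceed threshold plus noise is silent rather than negative.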

As for the MF inputs, we consider a couple of simplified 
models that capture the essential finding by [22] 
of the irregularly arranged multiple fields, as well as the 
observed low activity level of DG granule cells [9(, while 
retaining the mathematical simplicity that favours an 
analytical treatment. We thus assume that only a randomly 
selected fraction $p_{DG}$ of the granule cells are active in a 
new environment, of size $A$, and that those units are 
active in a variable number $Q_j$ of locations, with $Q_j$ drawn 
from a distribution with mean $q$. In model A the 
distribution is taken to be Poisson (the data reported by 
Leutgeb et al [13] are fit very well by a Poisson distribution 
with $q = 1.7$, but their sampling is limited). In model 
B the distribution is taken to be exponential (this better 
describes the results of the simulations in [39], though 
that simple model may well be inappropriate). Therefore, 
in either model, the firing rate $\beta_j(\vec{x})$ of DG unit $j$ 
is a combination of $Q_j$ gaussian "bumps", or fields, of 
equal effective size $(\sigma_f)^2$ and equal height $\beta_0$, centered 
at random points $\vec{x}_{jk}$ in the new environment 

$$\beta_j(\vec{x}) = \sum_{k=1}^{Q_j} \beta_0 \exp\Big( -\frac{(\vec{x} - \vec{x}_{jk})^2}{2 \sigma_f^2} \Big). \qquad (3)$$



The informative inputs driving the firing of a CA3 
pyramidal cell, during storage of a new representation, 
result therefore from a combination of three distributions, 
in the model. The first, Poisson but close to normal, 
determines the MF connectivity, that is how it is that 
each CA3 unit receives only a few tens of connections 
out of $N_{DG} \simeq 10^6$ granule cells (in the rat), whereby 
$\{c^{MF}_{ij}\} = 0, 1$ with $P(c^{MF}_{ij} = 1) = C_{MF}/N_{DG} \equiv c_{MF}$. 
The second, Poisson, determines which of the DG units 
presynaptic to a CA3 unit is active in the new environment, 
with $P(\text{unit } j \text{ is active}) = p_{DG}$. The third, either 
Poisson or exponential (and see model C below), determines 
how many fields an active DG unit has in the new 
environment. Note that in the rat $C_{MF} \simeq 46$ [3] whereas 
$p_{DG} \sim 0.02$-$0.05$, even when considering presumed 
newborn neurons 1. As a result, the mean number of active 
DG units presynaptic to a given CA3 unit, $p_{DG} C_{MF} \equiv \alpha$, 
is of order one, $\alpha \sim 1$-$2$, so that the second Poisson 
distribution effectively dominates over the first, and the 
number of active MF impinging on a CA3 unit can 
approximately be taken to be itself a Poisson variable with 
mean $\alpha$. As a qualification to such an approximation, 
one has to consider that different CA3 pyramidal cells, 
among the $N_{CA3} \simeq 3 \times 10^5$ present in the rat (on each 
side), occasionally receive inputs from the same active 
DG granule cells, but rarely, as $N_{DG} \simeq 10^6$, hence the 
pool of active units $p_{DG} N_{DG}$ is only one order of 
magnitude smaller than the population of receiving units $N_{CA3}$. 
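
The orders of magnitude quoted above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `active_input_stats` is introduced here for illustration, and the Poisson probabilities rely on the approximation stated in the text, that the number of active MF inputs per CA3 unit is Poisson with mean alpha.

```python
import math

N_DG = 1_000_000      # granule cells in the rat (order of magnitude from the text)
N_CA3 = 300_000       # CA3 pyramidal cells on each side
C_MF = 46             # mossy fiber inputs per CA3 unit


def active_input_stats(p_dg, c_mf=C_MF):
    """Mean number of active MF inputs, and Poisson probability of 0 or 1 of them."""
    alpha = p_dg * c_mf
    p0 = math.exp(-alpha)              # no active DG input at all
    p1 = alpha * math.exp(-alpha)      # exactly one active DG input
    return alpha, p0, p1


for p_dg in (0.02, 0.05):
    alpha, p0, p1 = active_input_stats(p_dg)
    pool = p_dg * N_DG                 # active granule cells in the environment
    print(f"p_DG={p_dg}: alpha={alpha:.2f}, P(0)={p0:.2f}, P(1)={p1:.2f}, "
          f"active pool={pool:.0f} vs N_CA3={N_CA3}")
```

With these values the pool of active granule cells is a few tens of thousands, roughly one order of magnitude below the number of CA3 units, and a sizeable fraction of CA3 units receives no active MF input at all.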

In a further simplification, we consider the MF synap- 
tic weights to be uniform in value, $J^{MF}_{ij} \equiv J$. This as- 
sumption, like those of equal height and width of the DG 
firing fields, is convenient for the analytical treatment 
but not necessary for the simulations. It will be relaxed 
later, in the computer simulations addressing the effect 
of MF synaptic plasticity. 

The new representation is therefore taken to be estab- 
lished by an informative signal coming from the dentate 
gyrus 



(4) 



modulated, independently for each CA3 unit, by a noise 
term $\delta_i$, reflecting recurrent and perforant path inputs 
as well as other sources of variability, and which we take 
to be normally distributed with zero mean and standard 
deviation $\delta$. 



The position $\vec{x}$ of the animal determines the firing $\{\beta\}$ 
of DG units, which in turn determine the probability 
distribution for the firing rate of any given CA3 pyramidal 
unit 



where 

$$\Phi(\rho(\vec{x})) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\rho(\vec{x})} e^{-t^2/2} \, dt$$

is the integral of the gaussian noise up to given signal-to-noise 
ratio 

$$\rho(\vec{x}) \equiv \tilde{\eta}(\vec{x})/\delta,$$

and $\Theta(\eta)$ is Heaviside's function vanishing for negative 
values of its argument. The first term, multiplying 
Dirac's $\delta(\eta_i)$, expresses the fact that negative activation 
values result in zero firing rates, rather than negative 
rates. 
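
The role of the cumulative gaussian can be made concrete with a short numerical check. The sketch below assumes, following the description above, that a unit has a non-zero rate whenever its signal exceeds the gaussian noise, so that the probability of firing at signal-to-noise ratio rho is Phi(rho). The function names `phi` and `firing_fraction`, and the numerical values, are illustrative choices.

```python
import math
import random


def phi(rho):
    """Cumulative standard normal, Phi(rho), written with the error function."""
    return 0.5 * (1.0 + math.erf(rho / math.sqrt(2.0)))


def firing_fraction(signal, delta, n_samples=100_000, seed=0):
    """Monte Carlo fraction of draws in which signal minus gaussian noise is positive."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if signal - rng.gauss(0.0, delta) > 0.0)
    return hits / n_samples


rho = 0.8                       # hypothetical signal-to-noise ratio
print(phi(rho), firing_fraction(signal=0.8, delta=1.0))
```

The two printed numbers should agree to within sampling error; the complement, 1 - Phi(rho), is the weight of the zero-rate term.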

Note that the resulting sparsity, i.e. how many of the 
CA3 units end up firing significantly at each position, 
which is a main factor affecting memory storage [42], is 
determined by the threshold $T$, once the other parameters 
have been set. The approach taken here is to assume 
that the system requires the new representation to 
be sparse and regulates the threshold accordingly. We 
therefore set $a_{CA3} = 0.1$, in broad agreement with 
experimental data [33], and adjust $T$ (as shown, for the 
mathematical analysis, in Sect. IV C). 
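
In a simulation, one simple way to regulate the threshold is to pick it as a quantile of the pre-threshold activations. The sketch below is a minimal illustration under two assumptions that are not taken from the text: sparsity is measured as the plain fraction of activations left above threshold, and the activations are drawn from a made-up distribution. The function name `threshold_for_sparsity` is hypothetical.

```python
import random


def threshold_for_sparsity(activations, target=0.1):
    """Return T such that a fraction `target` of the activations exceeds T."""
    ranked = sorted(activations, reverse=True)
    n_active = max(1, round(target * len(ranked)))
    # Place T midway between the last value kept active and the first one silenced.
    return 0.5 * (ranked[n_active - 1] + ranked[n_active])


rng = random.Random(1)
# Hypothetical pre-threshold activations: a sparse positive drive plus unit-variance noise.
acts = [(2.0 if rng.random() < 0.3 else 0.0) + rng.gauss(0.0, 1.0) for _ in range(10_000)]
T = threshold_for_sparsity(acts, target=0.1)
active = sum(1 for a in acts if a - T > 0.0) / len(acts)
print(T, active)
```

Other sparsity measures, for instance one based on the first two moments of the rate distribution, would require an iterative search for T instead of a single sort.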

The distribution of fields per DG unit is given in model 
A by the Poisson form 

$$P_A(Q) = \frac{q^Q}{Q!} e^{-q},$$

in model B by the exponential form 

$$P_B(Q) = \frac{1}{1+q} \Big( \frac{q}{1+q} \Big)^Q,$$

and we also consider as a control case model C, where 
each DG unit has one and only one field 

$$P_C(Q) = \delta_{1Q}.$$
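
The three field-count distributions can be sampled as in the following sketch. It assumes that the exponential form of model B is the geometric distribution on Q = 0, 1, 2, ... with mean q; the function name `sample_fields` and the sample sizes are illustrative.

```python
import math
import random


def sample_fields(model, q, rng):
    """Draw the number of fields Q for one active DG unit."""
    if model == "A":                       # Poisson with mean q (Knuth's method)
        limit, k, prod = math.exp(-q), 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        return k
    if model == "B":                       # geometric on 0, 1, 2, ... with mean q
        r = q / (1.0 + q)                  # P(Q) = (1 - r) * r**Q
        k = 0
        while rng.random() < r:
            k += 1
        return k
    if model == "C":                       # control case: exactly one field
        return 1
    raise ValueError(f"unknown model {model!r}")


rng = random.Random(0)
for model in ("A", "B", "C"):
    draws = [sample_fields(model, 1.7, rng) for _ in range(20_000)]
    print(model, sum(draws) / len(draws))
```

For models A and B the empirical mean should be close to q = 1.7, while model B has a visibly longer tail of units with many fields.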



C. Assessing spatial information content 

In the model, spatial position $\vec{x}$ is represented by CA3 
units, whose activity is informed about position by the 
activity of DG units, each determined independently of 
others by its place fields 

$$P(\{\beta_j(\vec{x})\}) = \prod_j P(\beta_j(\vec{x}))$$

with 

$$P(\beta_j(\vec{x})) = (1 - p_{DG}) \, \delta(\beta_j) + p_{DG} \sum_{Q_j=0}^{\infty} P_{A, B \,\mathrm{or}\, C}(Q_j) \, \delta\Big( \beta_j - \sum_{k=1}^{Q_j} \psi(\vec{x} - \vec{x}_{jk}) \Big)$$

where each contributing field is a gaussian bump 

$$\psi(\vec{x} - \vec{x}_{jk}) = \beta_0 \exp\Big( -\frac{(\vec{x} - \vec{x}_{jk})^2}{2 \sigma_f^2} \Big).$$


The Mutual Information $I(\vec{x}, \{\eta_i\})$ quantifies the 
efficiency with which CA3 activity codes for position, on 
average, as 

$$\langle I(\vec{x}, \{\eta_i\}) \rangle = \langle H_1(\{\eta_i\}) \rangle - \langle \langle H_2(\{\eta_i\} | \vec{x}) \rangle_{\vec{x}} \rangle \qquad (5)$$

where the outer brackets $\langle \cdot \rangle$ indicate that the average 
is not just over the noise $\delta$, as usual in the estimation 
of mutual information, but also, in our case, over the 
quenched, i.e. constant but unknown values of the 
microscopic quantities $c_{ij}$, the connectivity matrix, $Q_j$, the 
number of fields per active unit, and $\vec{x}_{jk}$, their centers. 
For given values of the quenched variables, the total 
entropy $H_1$ and the (average) equivocation $H_2$ are defined 
as 

$$H_1(\{\eta_i\}) = - \int \prod_i d\eta_i \, P(\{\eta_i\}) \log P(\{\eta_i\}) \qquad (6)$$

$$\langle H_2(\{\eta_i\} | \vec{x}) \rangle_{\vec{x}} = - \int \frac{d\vec{x}}{A} \int \prod_i d\eta_i \, P(\{\eta_i\} | \vec{x}) \log P(\{\eta_i\} | \vec{x}) \qquad (7)$$

where $A$ is the area of the given environment; the logs 
are intended in base 2, to yield information values in bits. 

The estimation of the mutual information can be 
approached analytically directly from these formulas, using 
the replica trick (see |2§|), as shown by [33] and [ic[, and 
briefly described in Sect. IV A. As in those two studies, 
however, here too we are only able to complete the 
derivation in the limit of low signal-to-noise, or more precisely 
of limited variation, across space, of the signal-to-noise 
around its mean, that is $\langle (\rho_i(\vec{x}) - \langle \rho_i(\vec{x}) \rangle_{\vec{x}})^2 \rangle_{\vec{x}} \to 0$. 
In this case we obtain, to first order in $N \equiv N_{CA3}$, an 
expression that can be shown to be equivalent to 



N 
ln2 



^|$(- ft (2))ln*(- ft (f))-$(- ft (f))ln J ^{- Qi {x,)) 

dx dxl ( 3>(Qi(x)) . ._. /-,m2 r / /-w 

\ g \ x ) ~ ft \ x ')\ + m\ x ) ~ Qi{xi)\ a{Qi{x)) 

dx dxl dx// <&(gAx)) . ... ,_ ..o. 
~X^-— 4 ^ - ft (*")]> 



(8) 



where we use the notation $\sigma(\rho) = (1/\sqrt{2\pi}) \exp(-\rho^2/2)$ (cp. [10], Eqs. 17, 45). 

Being limited to the first order in $N$, the expression above can be obtained in a straightforward manner by directly 
expanding the logarithms, in the large noise limit $\delta \to \infty$, in the simpler formula quantifying the information conveyed 
by a single CA3 unit 



1 T dx ( f drl 



dx 
~A 



dri (q-*7(f» 2 

-e 2^ In 



dv i? 2 (x)-Tj 2 (;/)-2T|(ij(x)-ii(H)) 

-f e ^ 
A 



(9) 



This single-unit formula cannot quantify the higher- 
order contributions in N, which decrease the informa- 
tion conveyed by a population in which some of the units 
inevitably convey some of the same information. The 
replica derivation, instead, in principle would allow one 
to take into proper account such correlated selectivity, 
which ultimately results in the information conveyed by 
large CA3 populations not scaling up linearly with N, 
and saturating instead once enough CA3 units have been 



sampled, as shown in related models by [13], [HI. In our 
case however the calculation of e.g. the second order 
terms in N is further complicated by the fact that differ- 
ent CA3 units receive inputs coming from partially over- 
lapping subsets of DG units. This may cause saturation 
at a lower level, once all DG units have been effectively 
sampled. The interested reader can follow the derivation 
sketched in Sect. IV A. 

Having to take, in any case, the large noise limit im- 



plies that the resulting formula is not really applicable to 
neuronally plausible values of the parameters, but only 
to the uninteresting case in which DG units impart very 
little information onto CA3 units. Therefore we use only 
the single-unit formula, and resort to computer simula- 
tions to assess the effects of correlated DG inputs. Sects. 
IV B and IV C indicate how to obtain numerical results 
by evaluating the expression in Eq. (5). 

Computer simulations can be used to estimate the in- 
formation present in samples of CA3 units of arbitrary 
size, and at arbitrary levels of noise, but at the price of an 
indirect decoding procedure. A decoding step is required 
because the dimensionality of the space spanned by the 
CA3 activity $\{\eta_i\}$ is too high. The decoding method we 
use, described in Sect. IV D, leads to two different types 
of information estimates, based on either the full or re- 
duced localization matrix. The difference between the 
two is illustrated under Results and further discussed at 
the end of the paper. 
FIG. 1: Network scheme: The DG-CA3 system indicat- 
ing examples of the fields attributed to DG units and of those 
resulting in CA3 units, the connectivity between the two pop- 
ulations, and the noise $\delta$ that replaces, in the model, also the 
effect of recurrent connections in CA3. 



II. RESULTS 

The essential mechanism described by the model is 
very simple, as illustrated in Fig. 1. CA3 units which 
happen to receive a few DG overlapping fields combine 
them in a resulting field of their own, that can survive 
thresholding. The devil is in the quantitative details: 
how often this occurs, how large are the output fields, 
how distinct above the noise, all factors that determine 
the information contained in the spatial representation. 
Note that the same CA3 unit can express multiple fields. 

It is convenient to discuss such quantitative details 
with reference to a standard set of parameters. Our 
model of reference is a network of DG units with fields 
represented by Gaussian-like functions of space. The 
number of fields per DG unit is given by a Pois- 
son distribution with mean value $q$, and parameters as 
specified in Table I. 

In general, the stronger the mean DG input, the more 
it dominates over the noise, and also the higher the 
threshold has to be set in CA3 to make the pattern of 
activity as sparse as required. To control for the trivial 
advantage of a higher signal-to-noise, we perform com- 
parisons in which it is kept fixed, by adjusting e.g. the 
MF synaptic strength J. 

A. Multiple input cells vs. multiple fields per cell 

The first parameter we considered is $q$, in light of the 
recent finding that DG units active in a restricted 
environment appear to have more often multiple fields than 
CA3 units, and much more often than expected, given 
their weak probability of being active. We wondered 
whether receiving multiple fields from the same input 
units would be advantageous for CA3, and if so whether 
there is an optimal $q$ value. We therefore estimated the 
mutual information when $q$ varies and $\mu$, the total mean 
number of DG fields that each CA3 cell receives as input, 
is kept fixed, by varying $C_{MF}$ correspondingly. Initially, 
we did indeed find an optimal $q$ value, that appeared to 
maximize the information available in CA3, and the 
optimal value was consistent with the recordings of [13]. 
After discovering a mistake in our initial analyses, however, 
we have realized that varying $q$ in this way makes very 
little difference. Fig. 2 reports the results of computer 
simulations, that illustrate also the dependence of the 
mutual information on $N_{CA3}$, the number of cells sampled. 
The dependence is sub-linear, but rather smooth, with 
significant fluctuations from sample-to-sample which are 
largely averaged out in the graph. The different lines 
correspond to different distributions of the input DG fields 
among active DG cells projecting to CA3, that is 
different combinations of values for $q$ and $C_{MF} = \mu/(q \, p_{DG})$, 
with $\mu$ kept constant; these different distributions do not 
affect much the information in the representation. 
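
The trade-off between $q$ and $C_{MF}$ at fixed $\mu$ amounts to simple arithmetic, sketched below. The standard values are those of Table I; taking $p_{DG}$ as exactly 1/30 (quoted as 0.033) is an assumption made here because it reproduces $\mu = 2.833$ with $C_{MF} = 50$ and $q = 1.7$. The function name `c_mf_for` and the list of $q$ values are illustrative.

```python
P_DG = 1.0 / 30.0     # about 0.033; assumed exact value behind the rounded figure
MU = 2.833            # mean number of DG fields reaching a CA3 unit


def c_mf_for(q, mu=MU, p_dg=P_DG):
    """Mean number of MF inputs per CA3 unit that keeps mu = C_MF * p_DG * q fixed."""
    return mu / (q * p_dg)


for q in (1.0, 1.7, 2.5, 4.0):          # hypothetical q values, for illustration
    print(f"q={q}: C_MF={c_mf_for(q):.1f}")
```

More fields per DG unit are thus compensated by proportionally fewer input units per CA3 cell, which is the comparison made across the different lines of Fig. 2.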

The analytical estimate of the information per CA3 
unit confirms that there is no dependence on $q$ (Fig. 2, 
inset). This is not a trivial result, as it would be if 
only the parameter $\mu$ entered the analytical expression. 
Instead, Sect. IV B shows that the parameters $C_m$ of 
the m-field decomposition depend separately on $q$ and 
$\alpha = p_{DG} C_{MF}$, so the fact that the two separate 
dependencies almost cancel out in a single dependence on their 
product, $\mu$, is remarkable. Moreover, such analytical 
estimate of the information conveyed by one unit does not 
match the first datapoints, for $N_{CA3} = 1$, extracted from 
the computer simulation; it is not higher, as might have 
been expected considering that the simulation requires an 
additional information-losing decoding step, but lower, 
by over a factor of 2. The finding that the analytical 
estimate differs from, and is in fact much lower than, 
the slope parameter extracted from the simulations, 
after the decoding step, is further discussed below. What 
the simulations and the analytical estimate have in 
common, beyond their incongruence in absolute values, is the 
absence of separate dependencies on $q$ and $\alpha$, as shown 
in Fig. 2. 

parameter                                            symbol      standard value 
probability a DG unit is active in one environment   $p_{DG}$    0.033 
mean number of DG inputs to a CA3 unit               $C_{MF}$    50 
mean number of fields per active DG unit             $q$         1.7 
mean number of fields activating a CA3 unit          $\mu$       $C_{MF} p_{DG} q = 2.833$ 
strength of MF inputs                                $J$         1, otherwise $2.833/\mu$ 
noise affecting CA3 activity                         $\delta$    1 (in units in which $\beta_0 = 2.02$) 
sparsity of CA3 activity                             $a_{CA3}$   0.1 

TABLE I: Parameters used in the standard version of the model. 

FIG. 2: The exact multiplicity of fields in DG units 
is irrelevant. Information about position plotted versus the 
number of CA3 units from which it is decoded, $N_{CA3}$, with the mean 
number of fields in the input to each CA3 unit constant at the 
value $\mu = 2.833$. Different lines correspond to a different mean 
number of fields per DG input unit, balanced by a different 
mean number of input units per CA3 unit. Inset: analytical 
estimate of the information per CA3 unit, from numerically 
integrating Eq. (9). 



B. More MF connections, but weaker 

Motivated by the striking sparsity of MF connections, 
compared to the thousands of RC and PP synaptic 
connections impinging on CA3 cells in the rat, we have then 
tested the effect of changing $C_{MF}$ without changing $q$. In 
order to vary the mean number of DG units that project 
to a single CA3 unit, while keeping constant the total 
mean input strength, assumed to be an independent 
biophysically constrained parameter, we varied inversely to 
$C_{MF}$ the synaptic strength parameter $J$. As shown in 
Fig. 3, the information presents a maximum at some 
intermediate value $C_{MF} \simeq 20$-$30$, which is observed both 
in simulations and in the analytical estimate, despite the 
fact that again they differ by more than a factor of two. 



Again we find that the analytical estimate differs from, 
and is in fact much lower than, the slope parameter 
extracted from the simulations, after the decoding step. 
Both measures, however, show that the standard model 
is not indifferent to how sparse the MF connections are. 
If they are very sparse, most CA3 units receive no inputs 
from active DG units, and the competition induced by 
the sparsity constraint tends to be won, at any point in 
space, by those few CA3 units that are receiving input 
from just one active DG unit. The resulting mapping is 
effectively one-to-one, unit-to-unit, and this is not 
optimal information-wise, because too few CA3 units are 
active; many of them in fact have multiple fields (Fig. 4b), 
reflecting the multiple fields of their "parent" units in 
DG. As $C_{MF}$ increases (with a corresponding decrease in 
MF synaptic weight), the units that win the competition 
tend to be those that summate inputs from two or more 
concurrently active DG units. The mapping ceases to 
be one-to-one, and this increases the amount of 
information, up to a point. When $C_{MF}$ is large enough that CA3 
units begin to sample more effectively DG activity, those 
that win the competition tend to be the "happy few" 
that happen to summate several active DG inputs, and 
this tends to occur at only one place in the environment. 
As a result, an ever smaller fraction of CA3 units have 
place fields, and those tend to have just one, often very 
irregular, as shown in Fig. 4. From that point on, the 
information in the representation decreases monotonically. 
The optimal MF connectivity is then in the range which 
maximizes the fraction of CA3 units that have a field in 
the newly learned environment, at a value, roughly one 
third, broadly consistent with experimental data (see e.g. 
0). 
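
The three regimes described above can be made tangible with rough numbers. The sketch below uses the connectivity values of the rate maps in Fig. 4 and the standard parameters of Table I, again assuming $p_{DG} = 1/30$, and relies on the Poisson approximation for the number of active MF inputs. The function name `mf_regime` is introduced here for illustration only.

```python
import math

P_DG = 1.0 / 30.0      # assumed exact value behind the quoted 0.033
Q = 1.7                # mean number of fields per active DG unit
MEAN_INPUT = 2.833     # J * mu, held fixed across comparisons


def mf_regime(c_mf):
    """Return (alpha, J, P0) for a given mean number of MF inputs per CA3 unit."""
    alpha = P_DG * c_mf                 # mean number of active DG inputs
    mu = alpha * Q                      # mean number of DG fields reaching the unit
    j = MEAN_INPUT / mu                 # weight that keeps J * mu constant
    p_no_input = math.exp(-alpha)       # Poisson probability of no active input
    return alpha, j, p_no_input


for c_mf in (7, 29, 150):
    alpha, j, p0 = mf_regime(c_mf)
    print(f"C_MF={c_mf}: alpha={alpha:.2f}, J={j:.2f}, P(no active input)={p0:.2f}")
```

At the sparse end most units have no active input and the few strong synapses dominate; at the dense end nearly every unit has several weak active inputs, so that only rare coincidences of many inputs can clear the threshold.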



It is important to emphasize that what we are report- 
ing is a quantitative effect: the underlying mechanism is 
always the same, the random summation of inputs from 
active DG units. DG in the model effectively operates 
as a sort of random number generator, whatever the val- 
ues of the various parameters. How informative are the 
CA3 representations established by that random number 
generator, however, depends on the values of the param- 
eters. 
FIG. 3: A sparse MF connectivity is optimal, but not too sparse. (a) Information plotted versus the number of CA3 
cells, with different colors for different values of $C_{MF}$. Dots represent information values obtained from simulations, while 
curves are exponentially saturating fits to the data points, as described in Methods. (b) Plot of the two parameters of the fit 
curves. Main figure: slope parameter describing the slope of the linear part of the curve (for low $N_{CA3}$), contrasted with the 
analytical estimate of the term proportional to $N_{CA3}$ (Eq. (9)); inset: total information parameter, describing the saturation level 
reached by the curve. 
FIG. 4: Information vs. connectivity: (a) Examples of CA3 firing rate maps for $C_{MF} = 7$ (top row), $C_{MF} = 29$ (middle) 
and $C_{MF} = 150$ (bottom); (b) Histogram that shows the fraction of CA3 units active somewhere in the environment, left, and 
the fraction of these with more than one field, right, for different $C_{MF}$ values. 



C. Other DG field distribution models 

We repeated the simulations using other models for 
the DG fields distribution, the exponential (model B) 
and the single field one (model C), and the results are 
similar to those obtained for model A: the information 
has a maximum when varying $C_{MF}$ on its own, and is 
instead roughly constant if the parameter $\mu$ is held 
constant (by varying $q$ inversely to $C_{MF}$). Fig. 5 reports 
the comparison, as $C_{MF}$ varies, between models A and 
B, with $q = 1.7$, and model C, where $q = 1$, so that 
in this latter case the inputs are weaker by a factor 1.7 (we 
did not compensate by multiplying $J$ by 1.7). 
Information measures are obtained by decoding several samples 
of 10 units, averaging and dividing by 10, and not by 
extracting the fit parameters. As one can see, the lower 
mean input for model C leads to lower information 
values, but the trend with $C_{MF}$ is the same in all three 
models. This further indicates that the multiplicity of 
fields in DG units, as well as its exact distribution, is of 
no major consequence, if comparisons are made keeping 
constant the mean number of fields in the input to a CA3 
unit. 



D. Sparsity of DG activity 

We study also how the level of DG activity affects 
the information flow. We choose different values for the 
probability $p_{DG}$ that a single DG unit fires in the given 
environment, and again we adjust the synaptic weight $J$ 
to keep the mean DG input per CA3 cell constant across 
the comparisons. 

FIG. 5: Information vs. connectivity: Information 
plotted versus different values of connectivity between DG and 
CA3. Solid lines are all from simulations (localization 
information from samples of 10 units, divided by 10), as follows: 
for the blue line, the distribution defining the number of fields 
in DG cells is Poisson (model A); for the green line, it is 
exponential (model B); and for the red line, each DG active unit 
has one field only (model C). 

Results are similar to those obtained varying the 
sparsity of the MF connections. Indeed, the analytical 
estimate in the two conditions would be exactly the same, 
within the approximation with which we compute it, 
because the two parameters $p_{DG}$ and $C_{MF}$ enter the 
calculation in equivalent form, as a product. The actual 
difference between the two parameters stems from the 
fact that increasing $C_{MF}$, CA3 units end up sampling 
more and more the same limited population of active 
DG units, while increasing $p_{DG}$ this population increases 
in size. This difference can only be appreciated from the 
simulations, which however show that the main effect 
remains the same: an information maximum for rather 
sparse DG activity (and sparse MF connections). The 
subtle difference between varying the two parameters can 
be seen better in the saturation information value: with 
reference to the standard case, in the center of the graph 
in the inset, to the right increasing $p_{DG}$ leads to more 
information than increasing $C_{MF}$, while to the left the 
opposite is the case, as expected. 

E. Full and simplified decoding procedures 

As noted above, we find that the analytical estimate of the information
per unit is always considerably lower than the slope parameter of the fit
to the measures extracted from the simulations, contrary to expectations,
since the latter require an additional decoding step, which implies some
loss of information. We also find, however, that the measures of mutual
information that we extract from the simulations are strongly dependent
on the method used, in the decoding step, to construct the "localization
matrix", i.e. the matrix which compiles the frequency with which the
virtual rat was decoded as being in position x' when it was actually in
position x. All measures reported so far, from simulations, are obtained
constructing what we call the full localization matrix Q(x, x') which, if
the square environment is discretized into 20 x 20 spatial bins, is a
large 400 x 400 matrix, which requires of order 160,000 decoding events
to be effectively sampled. We ran simulations with trajectories of
400,000 steps, and additionally corrected the information measures for
the limited sampling bias.

An alternative that allows extracting unbiased measures from much shorter
simulations is to construct a simplified matrix Q(x - x'), which averages
over decoding events with the same vector displacement between actual
and decoded positions. Q(x - x') is easily constructed on the torus we
used in all simulations, and being a much smaller 20 x 20 matrix it is
effectively sampled in just a few thousand steps.
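The averaging over displacements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simplified_matrix`, the row-major indexing of spatial bins and the use of raw event counts are choices made for this example, not details taken from the simulation code.

```python
def simplified_matrix(full, side):
    """Collapse a full localization matrix into Q(x - x') on a torus.

    full[x][xp] is the number of events with actual bin x and decoded
    bin xp; bins are indexed row-major, x = row * side + col (assumed).
    Returns a side x side matrix of summed counts, indexed by the
    toroidal displacement (d_row, d_col) from actual to decoded bin.
    """
    n_bins = side * side
    simple = [[0.0] * side for _ in range(side)]
    for x in range(n_bins):
        r, c = divmod(x, side)
        for xp in range(n_bins):
            rp, cp = divmod(xp, side)
            # periodic boundaries: displacements wrap around the torus
            simple[(rp - r) % side][(cp - c) % side] += full[x][xp]
    return simple
```

Dividing the result by the total number of events turns the summed counts into the displacement distribution; with perfect decoding all the weight falls on zero displacement.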

The two decoding procedures, given that the simplified matrix is the
shifted average of the rows of the full matrix, might be expected to
yield similar measures, but they do not, as shown in Fig. 7. The
simplified matrix, by assuming translation invariance of the errors in
decoding, is unable to quantify the information implicitly present in the
full distribution of errors around each actual position. Such errors are
of an "episodic" nature: the local view from position x might happen to
be similar to that from position x', hence neural activity reflecting in
part local views might lead to confusing the two positions, but this does
not imply that another position z has anything in common with
z + (x' - x). Our little network model captures this discrepancy, in
showing, in Fig. 7, that for any actual position there are a few selected
positions that are likely to be erroneously decoded from the activity of
a given sample of units; when constructing instead the translationally
invariant simplified matrix, all average errors are distributed smoothly
around the correct position (zero error), in a roughly Gaussian bell.

Apparently, the analytical estimate is also unable to capture the spatial
information implicit in such "episodic" errors, as its values are well
below those obtained with the full matrix, and somewhat above those
obtained with the simplified matrix (consistent with some loss with
decoding). These differences do not alter the other results of our study,
since they affect the height of the curves, not their shape; they do,
however, have important
implications. The simplified matrix has the advantage 
of requiring much less data, i.e. less simulation time, 
but also less real data if applied to neurophysiological 
recordings, than the full matrix, and in most situations 
it might be the only feasible measure of spatial informa- 
tion (the analytical estimate is not available of course for 
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FIG. 6: Sparse DG activity is effective at driving CA3. (a) Information plotted versus the number of CA3 units, different 
colors correspond to different values for $p_{DG}$. Dots represent information values obtained from simulations, while the curves
are exponentially saturating fits to the data points, as described in Methods. (b) Plot of the two parameters of the fits. Main
figure: slope parameter describing the slope of the linear part of the information curve (for low $N_{CA3}$); inset: total information
parameter describing the saturation level reached by the information; both are contrasted with the corresponding measures
(dashed lines) obtained varying $C_{MF}$ instead of $p_{DG}$.
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FIG. 7: Localization Matrices. Left: the rows of the full matrix represent the actual positions of the virtual rat while its 
columns represent decoded positions (the full matrix is actually 400 x 400); three examples of rows are shown, rendered here 
as 20 x 20 squares, all from decoding by a given sample of 10 units. The simplified matrix is a single 20 x 20 matrix obtained 
(from the same sample) as the average of the full matrix taking into account translation invariance. Right: the two procedures
lead to large quantitative differences in information (here, the measures from samples of 10 units, divided by 10), but with the
same dependence on $C_{MF}$.
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real data). So in most cases it is only practical to mea- 
sure spatial information with methods that, our model 
suggests, miss out much of the information present in 
neuronal activity, what we may refer to as "dark information", not
easily revealed. One might conjecture that the prevalence of dark
information is linked to the random nature of the spatial code
established by DG inputs. It might be that additional stages of
hippocampal processing, either with the refinement of recurrent CA3
connections or in CA1, are instrumental in making dark information more
transparent.



F. Effect of learning on the Mossy Fibers 

While the results reported thus far assume that MF weights are fixed,
J = 1, we have also conducted a preliminary analysis of how the amount of
spatial information in CA3 might change as a consequence of plasticity on
the mossy fibers. In an extension of the standard model, we allow the
weights of the connections between DG and CA3 to change with a model
"Hebbian" rule. This is not an attempt to capture the nature of MF
plasticity, which is not NMDA-dependent and might not be associative
[30], but only the adoption of a simple plasticity model that we use in
other simulations. At each
time step (that corresponds to a different place in space) 
weights are taken to change as follows: 

$$\Delta J_{ij}^{MF}(t) = \gamma_{MF}\, \eta_i(x(t)) \left( \beta_j(x(t)) - \langle \beta_j(x(t)) \rangle \right) \qquad (10)$$

where $\gamma_{MF}$ is a plasticity factor that regulates the amount of
learning. Modifying the MF weights in this way has the general effect of
increasing information values, so that they approach saturation levels
for a lower number of CA3 cells; in particular this is true for the
information extracted from both full and simplified matrices. In Fig. 8
the effect of such "learning" is shown for different values of the
parameter $\gamma_{MF}$, as a function of connectivity. We see that
allowing for this type of plasticity on mossy fibers shifts the maximum
of information as a function of the connectivity level. The structuring
of the weights effectively results in the selection of favorite input
connections, for each CA3 unit, among a pool of available ones; the
remaining strong connections are a subset of those "anatomically" present
originally. It is logical, then, that starting with a larger pool of
connections, among which to pick the "right" ones, leads to more
information than starting with few connections, which further decrease in
effective number with plasticity. We expect better models of the details
of MF plasticity to preserve this main effect.
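A single step of such a model rule can be sketched as follows. This is a minimal illustration, assuming that the weight change is the product of the postsynaptic rate and the deviation of the presynaptic rate from its mean, scaled by the plasticity factor. The function name `hebbian_update`, the restriction to existing connections and the clipping of weights at zero are assumptions of this example, not statements about the simulations reported here.

```python
def hebbian_update(weights, post, pre, pre_mean, gamma, connected):
    """One plasticity step on the DG-to-CA3 weights (illustrative sketch).

    weights[i][j]   : weight from DG unit j to CA3 unit i
    post[i]         : current firing rate of CA3 unit i
    pre[j]          : current firing rate of DG unit j
    pre_mean[j]     : mean firing rate of DG unit j over the environment
    gamma           : plasticity factor
    connected[i][j] : True where an anatomical connection exists
    """
    for i, row in enumerate(weights):
        for j in range(len(row)):
            if connected[i][j]:
                delta = gamma * post[i] * (pre[j] - pre_mean[j])
                # assumption: weights are kept non-negative
                row[j] = max(0.0, row[j] + delta)
    return weights
```

With a small plasticity factor each step changes the weights only slightly, so the structure of the weights emerges from the accumulation of many steps along the trajectory.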

A further effect of learning, along with the disappearance of some CA3
fields and the strengthening of others, is the refinement of their shape,
as illustrated in Fig. 9. It is likely that this effect will also be
observed when using more biologically accurate models of MF plasticity.
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FIG. 8: Information vs. connectivity for different lev- 
els of learning. Information is plotted as a function of 
the connectivity level between DG and CA3, different colors
correspond to different values of the learning factor $\gamma_{MF}$.
Simulations run for 100,000 training steps, during a fraction
$\sim a_{CA3} = 0.1$ of which each postsynaptic unit is strongly
activated, and its incoming weights liable to be modified. The
$\gamma$ values tested hence span the range from minor modification
of the existing weights, for $\gamma = 0.00005$, to major restructuring
of all available weights for $\gamma = 0.002$.




FIG. 9: MF plasticity can suppress, enlarge and in 
general refine CA3 place fields. The place fields of five 
example units are shown before plasticity is turned on (top 
row) and after 100,000 steps with a large plasticity factor 
$\gamma_{MF} = 0.0001$ (bottom row). The rounding and regularization
of the fields was observed also for several other units in
the simulation. 



G. Retrieval abilities 

Finally, all simulations reported so far involved a full 
complement of DG inputs at each time step in the simu- 
lation. We have also tested the ability of the MF network 
to retrieve a spatial representation when fed with a de- 
graded input signal, with and without MF plasticity. The 
input is degraded, in our simulation, simply by turning 
on only a given fraction, randomly selected, of the DG 
units that would normally be active in the environment. 
The information extracted after decoding by a sample of 
units (in Fig. 10, 10 units) is then contrasted with the
size of the cue itself. In the absence of MF plasticity, 
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FIG. 10: Information reconstructed from a degraded 
input signal. Slope parameter $I_1$ of the information curve
as a function of the percentage of the DG input that CA3 receives.
Inset: the same plot for the total information parameter
$I_\infty$. The same training protocol was run as for Figs. 8 and 9.



there is obviously no real retrieval process to talk about, 
and the DG-CA3 network simply relays partial information.
When Hebbian plasticity is turned on, the expectation
from similar network models (see e.g. [43], Fig. 9) is
that there would be some pattern completion, i.e. some 
tendency for the network to express nearly complete out- 
put information when the input is partial, resulting in a 
more sigmoidal input-output curve (the exact shape of 
the curve depends of course also on the particular mea- 
sure used). It is apparent from Fig. [10] that while, in 
the absence of plasticity, both parameters characterizing 
the information that can be extracted from CA3 grow 
roughly linearly with the size of the cue, with plasticity 
the growth is supralinear. This amounts to the statement 
that the beneficial effects of plasticity require a full cue to 
be felt - the conceptual opposite to pattern completion, 
the process of integrating a partial cue using information 
stored on modified synaptic weights. This result sug- 
gests that the sparse MF connectivity is sub-optimal for 
the associative storage that leads to pattern completion, 
a role that current perspectives ascribe instead to per- 
forant path and recurrent connections to CA3. The role 
of the mossy fibers, even if plastic, may be limited to the 
establishment of new spatial representations. 
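The degradation of the input cue used in these tests amounts to keeping a random subset of the DG units that are active in the environment. A minimal Python sketch, in which the function name `degrade_cue` and the rounding of the subset size are assumptions made for illustration:

```python
import random


def degrade_cue(active_units, fraction, rng):
    """Return a random subset with the given fraction of active DG units.

    active_units : list of indices of DG units active in the environment
    fraction     : size of the cue, between 0 and 1
    rng          : a random.Random instance, so that runs are reproducible
    """
    n_keep = round(fraction * len(active_units))
    return sorted(rng.sample(active_units, n_keep))
```

Units left out of the subset are simply silenced for the whole session, so that a cue of fraction 1 reproduces the full input.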



III. DISCUSSION 

Ours is a minimal model, which by design overlooks several of the
elements likely to play an important role in the functions of the dentate
gyrus - perhaps foremost, neurogenesis [19]. Nevertheless, by virtue of
its simplicity, the model helps clarify a number of quantitative issues
that are important in refining a theoretical perspective of how the
dentate gyrus may work.

First, the model indicates that the recently discovered multiplicity of
place fields expressed by active dentate granule cells [22] might be just
a "fact of life", with no major computational implications for dentate
information processing. Still, requiring that active granule cells
express multiple fields seems to lead, in another simple network model
(of how dentate activity may result from entorhinal cortex input [39]),
to the necessity of inputs from lateral EC, and therefore to the
refinement of sequential computational constraints on the operation of
hippocampal circuits.

Second, the model shows that, assuming a fixed total MF input strength on
CA3 units, it is beneficial in information terms for the MF connectivity
to be very sparse, but not vanishingly sparse. The optimal number of
anatomical MF connections on CA3 units depends somewhat on the various
parameters (the noise in the system, how sparse the activity is in DG and
CA3, etc.) and it may increase slightly when taking MF plasticity into
account, but it appears to be within the range of the number, 46,
reported for the rat. It will be interesting to see whether future
measures of MF connectivity in other species correspond to those
"predicted" by our model once the appropriate values of the other
parameters are also experimentally measured and inserted into the model.
A similar set of considerations applies to the fraction of granule cells
active in a given environment, $p_{DG}$, which in the model plays a
similar, though not completely identical, role to $C_{MF}$ in determining
information content.

Third, the model confirms that the sparse MF connec- 
tions, even when endowed with associative plasticity, are 
not appropriate as devices to store associations between 
input and output patterns of activity - they are just too 
sparse. This reinforces the earlier theoretical view [28], [45], which
was not based however on an analysis of spatial representations, that the
role of the dentate gyrus is in establishing new CA3 representations and
not in associating them to representations expressed elsewhere in the
system. Availing itself of more precise experimental parameters, and
based on the spatial analysis, the current model can refine the earlier
theoretical view and correct, for example, the notion that "detonator"
synapses,
firing CA3 cells on a one-to-one basis, would be optimal 
for the mossy fiber system. The optimal situation turns 
out to be the one in which CA3 units are fired by the 
combination of a couple of DG input units, although this 
is only a statistical statement. Whatever the exact dis- 
tribution of the number of coincident inputs to CA3, DG 
can be seen as a sort of random pattern generator, that 
sets up a CA3 pattern of activity without any structure 
that can be related to its anatomical lay-out [34], or to 
the identity of the entorhinal cortex units that have ac- 
tivated the dentate gyrus. As with random number gen- 
erators in digital computers, once the product has been 
spit out, the exact process that led to it can be forgotten. 
This is consistent with experimental evidence that inac- 
tivating MF transmission or lesioning the DG does not 






lead to hippocampal memory impairments once the information has already
been stored, but leads to impairments in the storage of new information
[2(3] , [HI . The inability of MF connections to subserve pattern
completion is also consistent with suggestive evidence from imaging
studies with human subjects 0].

Fourth, and more novel, our findings imply that a substantial fraction of
the information content of a spatial CA3 representation, over half when
sampling limited subsets of CA3 units, can neither be extracted through
the simplified method which assumes translation invariance, nor assessed
through the analytical method (which anyway requires an underlying model
of neuronal firing, and is hence only indirectly applicable to real
neuronal data). This large fraction of the information content is only
extracted through the time-consuming construction of the full
localization matrix. To avoid the limited sampling bias [31] this would
require, in our hands, the equivalent of a ten hour session of recording
from a running rat (!), with a square box sampled in 20 x 20 spatial
bins. We have hence labeled this large fraction as dark information,
which requires a special effort to reveal. Although we know little of how
the real system decodes its own activity, e.g. in downstream neuronal
populations, we may hypothesize that the difficulty of extracting dark
information affects the real system as well, and that successive stages
of hippocampal processing have evolved to address this issue. If so,
qualitatively this could be characterized as the representation
established in CA3 being episodic, i.e. based on an effectively random
process that is functionally forgotten once completed, and later
processing, e.g. in CA1, may be thought to gradually endow the
representations with their appropriate continuous spatial character.
Another network model, intended to elucidate how CA1 could operate in
this respect, is the object of our on-going analysis.

The model analysed here does not include neurogen- 
esis, a most striking dentate phenomenon, and thus it 
cannot comment on several intriguing models that have 
been put forward about the role of neurogenesis in the 
adult mammalian hippocampus [l|, Q, [4q |. Neverthe- 
less, presenting a simple and readily expandable model of 
dentate operation can facilitate the development of fur- 
ther models that address neurogenesis, and help interpret 
puzzling experimental observations. For example, the 
idea that once matured newborn cells may temporally 
"label" memories of episodes occurring over a few weeks 
[I3> EH* 0j EI has been weakened by the observation
that apparently even young adult-born cells, which are 
not that many 0], [2g|, [4l[, are very sparsely active, 
perhaps only a factor of two or so more active than older 
granule cells Q. Maybe such skepticism should be re- 
considered, and the issue reanalysed using a quantitative 
model like ours. One could then investigate the notion 
that the new cells link together, rather than separating, 
patterns of activity with common elements (such as the 
temporal label). To do that clearly requires extending 
the model to include a description not only of neuroge- 
nesis, but also of plasticity within DG itself [13] and of 
its role in the establishment of successive representations 
one after the other. 
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IV. METHODS 



A. Replica calculation 



Estimation of the equivocation 



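Calculating the equivocation from its definition in Eq. 7 is straightforward, thanks to the simplifying assumption of
independent noise in CA3 units. We get

$$\langle H_2(\{\eta_i\}) \rangle = \frac{N}{\ln 2} \int \frac{dx}{A} \left\{ -\Phi(-\rho_i(x)) \ln \Phi(-\rho_i(x)) + \Phi(\rho_i(x)) \left[ \ln(\sqrt{2\pi}\,\delta) + \frac{1}{2} \right] - \frac{1}{2}\, \rho_i(x)\, \sigma(\rho_i(x)) \right\} \qquad (11)$$

where

$$\rho_i(x) \equiv \frac{\tilde{\eta}_i(x)}{\delta}, \qquad \sigma(\rho) \equiv \frac{1}{\sqrt{2\pi}}\, e^{-\rho^2/2},$$

although the spatial integral remains to be carried out.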
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Estimation of the entropy 



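For the entropy, Eq. 8, the calculation is more complicated. Starting from

$$\langle H_1(\{\eta_i\}) \rangle = -\frac{1}{\ln 2} \int \frac{dx}{A} \int \prod_i d\eta_i\, P(\{\eta_i\}|x) \ln \left[ \int \frac{dx'}{A}\, P(\{\eta_i\}|x') \right]$$

we remove the logarithm using the replica trick (see [29])

$$\langle H_1(\{\eta_i\}) \rangle = -\frac{1}{\ln 2} \lim_{n \to 1} \frac{1}{n-1} \left\{ \int \prod_i d\eta_i \left[ \int \frac{dx}{A}\, P(\{\eta_i\}|x) \right]^n - 1 \right\} \qquad (12)$$

which can be rewritten

$$\langle H_1(\{\eta_i\}) \rangle = -\lim_{n \to 1} \frac{H_1(n) - H_1(1)}{\ln 2\, (n-1)} \qquad (13)$$

using the spatial averages, defined for an arbitrary real-valued number n of replicas

$$H_1(n) = \int \frac{dx_1 \cdots dx_n}{A^n} \prod_i h_i(\{x_\beta\}, n) \qquad (14)$$

where we have defined a quantity dependent on both the number n of replicas and on the position in space, later to
be integrated over, of each replica $\beta$:

$$h_i(\{x_\beta\}, n) = \int d\eta_i \prod_{\beta=1}^{n} \left[ \Phi(-\rho_i(x_\beta))\, \delta(\eta_i) + \frac{e^{-(\eta_i - \tilde{\eta}_i(x_\beta))^2 / 2\delta^2}}{\sqrt{2\pi}\,\delta}\, \Theta(\eta_i) \right]$$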



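We need therefore to carry out integrals over the firing rate of each CA3 unit, $\eta_i$, in order to estimate $h_i(\{x_\beta\}, n)$,
while keeping in mind that in the end we want to take $n \to 1$. Carrying out the integrals yields a below-threshold
and an above-threshold term

$$h_i(\{x_\beta\}, n) = \prod_{\beta} \Phi(-\rho_i(x_\beta)) + \frac{e^{-(n-1)\, S(\{x_\beta\}) / 2\delta^2}}{(\sqrt{2\pi}\,\delta)^{n}} \int_{-\tilde{\eta}_0}^{\infty} d\hat{\eta}\; e^{-n \hat{\eta}^2 / 2\delta^2} \qquad (15)$$

where we have defined the quantities

$$S(\{x_\beta\}) = -\left\{ \frac{1}{n(n-1)} \left[ \sum_\beta \tilde{\eta}_i(x_\beta) \right]^2 - \frac{1}{n-1} \sum_\beta \tilde{\eta}_i^2(x_\beta) \right\} = \frac{1}{2n(n-1)} \sum_{\alpha,\beta} \left[ \tilde{\eta}_i(x_\alpha) - \tilde{\eta}_i(x_\beta) \right]^2 \qquad (16)$$

and $\tilde{\eta}_0 = (1/n) \sum_\beta \tilde{\eta}_i(x_\beta)$, while $\hat{\eta} = \eta - \tilde{\eta}_0$.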

One might think that $h_i(\{x_\beta\}, n \simeq 1) \to 1 + (n-1)\, h_{i,n}(\{x_\beta\}, 1) + O((n-1)^2)$, hence in the product over cells, that
defines the entropy $H_1(\{\eta_i\})$, the only terms that survive in the limit $n \to 1$ would just be the summed single-unit
contributions obtained from the first derivatives with respect to n. This is not true, however, as taking the replica limit
produces the counterintuitive effect that replica-tensor products of terms, which individually disappear for $n \to 1$,
only vanish to first order in $n - 1$, as shown by [10]. The replica method is therefore able, in principle, to quantify
the effect of correlations among units, expressed in entropy terms stemming from the product of $h_i$ across units.
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Briefly, one has 
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(17) 



where the first two rows come from the term below threshold, and the last two from the one above threshold. Then, 
following [Tol | . 
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— — ? I„ - g(^)J + — r (e(x Q ) - Qo) [Q{Xf3) ~ Qo) 

4n(n — 1) 2<t> (—g ) 



and where we have considered that in the limit n — > 1 we have T) /S = g appear in all terms of finite weight. 

The products between the matrices $G_{\alpha,\beta}$ attached to each CA3 unit generate the higher order terms in N. Calculating
them in our case, in which different CA3 units can receive partially overlapping inputs from DG units, is extremely
complex (see [11], where information transmission across a network is also considered), and we do not pursue here the
analysis of such higher order terms. One can retrieve the result of the TG model in Ref. [10] by taking the further
limit $\rho_0 \to 0$, which implies $\Phi(\rho_0) \to 1/2$ and $\sigma^2(\rho_0) \to 1/(2\pi)$. A further subtlety is that, in taking the $n \to 1$
limit, there is a single replica, say x, which is counted once in the limit, but also several different replicas, denoted
x', x'', ..., whose weights vanish, but which remain to determine e.g. the terms proportional to $(n - 1)$ emerging from
the derivatives. Thus, in the very last term of Eq. 17 one has to differentiate $\rho_0$ with respect to n to produce the T term
of Eq. 19, which is absent in [10] because it vanishes with $\rho_0$. In the off-diagonal terms of the G matrix there are
$2(n - 1)$ entries dependent on replicas x and x', and $(n - 1)(n - 2)$ entries dependent on replicas x' and x''.

Focusing now solely on terms of order N, note that the term S is effectively a spatial signal. In the $n \to 1$ limit it
can be rewritten, using x for the single surviving replica, as

$$S(x, x') = \left[ \tilde{\eta}_i(x) - \tilde{\eta}_i(x') \right]^2 - \frac{1}{2} \left[ \tilde{\eta}_i(x') - \tilde{\eta}_i(x'') \right]^2 .$$

This allows us to derive, to order N, our result for the spatial information content, Eq. 5.

Note that when the threshold of each unit tends to $-\infty$, and therefore its mean activation $\rho_i \to \infty$, our units
behave as threshold-less linear units with Gaussian noise, and the information they convey tends to

$$\langle I(x, \{\eta_i\}) \rangle = \frac{N}{4 \ln 2} \left\langle \int \frac{dx\, dx'}{A^2} \left[ \rho_i(x) - \rho_i(x') \right]^2 \right\rangle \qquad (20)$$

which is simply expressed in terms of a spatial signal-to-noise ratio, and coincides with the results in Refs. [37], [10].
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B. m-field Decomposition 



Eqs. [8] and [9] simply sum equivalent average contributions from each CA3 unit. Each such contribution can then be 
calculated as a series in m, the number of DG fields feeding into the CA3 unit. One can in fact write, for example,



(4>( ft (f))^(f)) 



5 2 



P(2) 
5 2 



E p(QiMQim 

Qi=0 



Qi,Q 2 =0 



^jPj (x, {x jk }) - T 



E P{Qi)P{Q2) ■ ■ ■ P(Q 7 )$( ft (f)) 

Qi,Q2---Q- 7 =0 

2^ 



where in each term there are $\gamma$ active DG units, indexed by j, presynaptic to CA3 unit i, and each has $Q_j$ fields
(including the possibility that $Q_j = 0$), indexed by k.



A similar expansion can be written for the other terms.
One then realizes that the spatial component reduces to
integrals that depend solely on the total number of fields
$m = \sum_j Q_j$, no matter how many DG active units they
come from, and the expansion can be rearranged into an
expansion in m
N 00 

(/ (X , { Vt })) = — E G mD m {T) (21) 
rn=Q 

where one of the components in each term is, for example, 

D m (T) = J ^^...^ m{ ^ } )) e ^^{^ } ) 

(22) 

with $\rho(x, \{x_j\}) = \left[ \sum_{j=1}^{m} \psi(x - x_j) - T \right] / \delta$ the mean
signal-to-noise at position x produced by m fields, from
no matter how many DG units. The numerical coefficient
$C_m$, instead, stems from the combination of the distribution
for the number of fields for each presynaptic DG unit
active in the environment, which differs between models
A, B and C, and the Poisson distribution for the number
of such units

$$P(\gamma) = \frac{\alpha^\gamma}{\gamma!}\, e^{-\alpha}, \qquad \alpha = p_{DG} C_{MF}.$$

The sum extends in principle to $m \to \infty$, but in practice
it can be truncated after checking that successive terms
give vanishing contributions. The appropriate truncation
point obviously depends on the mean number of fields q,
as well as on the model distribution of fields per unit.
Note that the first few terms (e.g. for m = 0, 1, ...) may
give negative but not necessarily negligible contributions
if the effective threshold T is high.
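For model A,

$$P_A(Q) = \frac{q^Q}{Q!}\, e^{-q}$$

and combining the two Poisson series one finds

$$C_m = e^{\alpha (e^{-q} - 1)}\, \frac{q^m}{m!}\, K_m(\lambda) \qquad (23)$$

where $K_0 = 1$ and the other $K_m(\lambda)$ are the polynomials

$$\begin{cases} K_1 = \lambda \\ K_2 = \lambda + \lambda^2 \\ K_3 = \lambda + 3\lambda^2 + \lambda^3 \\ K_4 = \lambda + 7\lambda^2 + 6\lambda^3 + \lambda^4 \\ K_m = \sum_{l=1}^{m} T(l, m)\, \lambda^l \end{cases}$$

given by the modified Khayyam-Tartaglia recursion relation

$$T(l, m) = T(l - 1, m - 1) + l\, T(l, m - 1)$$

and where $\lambda = \alpha e^{-q}$.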
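The closed form for model A can be checked numerically against a direct sum over the number of active presynaptic units. The Python sketch below assumes the compound-Poisson reading of the text: the number of active units is Poisson with mean alpha, each unit has a Poisson number of fields with mean q, and the coefficient for m fields is the probability of a total of m fields. The function names and the values of alpha and q in the usage note are illustrative choices, not the parameters of the reported simulations.

```python
import math


def stirling_row(m):
    """Coefficients T(l, m), l = 0..m, built with the recursion
    T(l, m) = T(l-1, m-1) + l * T(l, m-1), starting from T(0, 0) = 1."""
    row = [1]
    for k in range(1, m + 1):
        new = [0] * (k + 1)
        for l in range(1, k + 1):
            same_l = row[l] if l < len(row) else 0
            new[l] = row[l - 1] + l * same_l
        row = new
    return row


def c_m_model_a(m, alpha, q):
    """Closed form: exp(alpha*(exp(-q)-1)) * q**m / m! * K_m(lambda)."""
    lam = alpha * math.exp(-q)
    k_m = sum(t * lam ** l for l, t in enumerate(stirling_row(m)))
    prefactor = math.exp(alpha * (math.exp(-q) - 1.0))
    return prefactor * q ** m / math.factorial(m) * k_m


def c_m_direct(m, alpha, q, gamma_max=100):
    """Direct sum over gamma active units, each with Poisson(q) fields."""
    total = 0.0
    p_gamma = math.exp(-alpha)          # Poisson weight for gamma = 0
    for g in range(gamma_max):
        if g == 0:
            p_m = 1.0 if m == 0 else 0.0
        else:
            mean = g * q                # sum of g Poisson(q) counts
            p_m = math.exp(-mean) * mean ** m / math.factorial(m)
        total += p_gamma * p_m
        p_gamma *= alpha / (g + 1)      # next Poisson weight
    return total
```

For example, with alpha = 1.65 and q = 1.7 the two functions agree to numerical precision for the first few values of m, and the coefficients sum to one when enough terms are included.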
For model B, 

$$P_B(Q) = \frac{1}{1 + q} \left( \frac{q}{1 + q} \right)^Q$$
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and combining the Poisson with the exponential series 
one finds 



$$C_m = e^{-\alpha q / (1 + q)} \left( \frac{q}{1 + q} \right)^m K_m(\lambda) \qquad (24)$$



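where again $K_0 = 1$, while the other $K_m(\lambda)$ are the distinct
polynomials

$$\begin{cases} K_1 = \lambda \\ K_2 = \lambda + \lambda^2/2! \\ K_3 = \lambda + 2\lambda^2/2! + \lambda^3/3! \\ K_4 = \lambda + 3\lambda^2/2! + 3\lambda^3/3! + \lambda^4/4! \\ K_m = \sum_{l=1}^{m} \tilde{T}(l, m)\, \lambda^l \end{cases}$$

given by the further modified Khayyam-Tartaglia recursion relation

$$\tilde{T}(l, m) = \tilde{T}(l - 1, m - 1)/l + \tilde{T}(l, m - 1)$$

and where $\lambda = \alpha/(1 + q)$.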
For model C,

$$P_C(Q) = \delta_{1Q}$$

there is no parameter q (i.e., q = 1), and one simply finds

$$C_m = e^{-\alpha}\, \frac{\alpha^m}{m!}. \qquad (25)$$

Note that in the limit $q \to 0$, when the mean input per
CA3 unit $\mu = \alpha q$ remains finite, for both models A and
B one finds

$$\lim_{q \to 0} C_m = e^{-\mu}\, \frac{\mu^m}{m!}$$

which is equivalent to Eq. 25, in line with the fact that
both models A and B reduce, in the $q \to 0$ limit, to
single-field distributions, but even units with single fields
become vanishingly rare, so formally one has to scale up
the mean number of active presynaptic units, $\alpha$, to keep
$\mu = \alpha q$ finite and establish the correct comparison to
model C.
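The coefficients of model B and their small-q limit can be checked in the same spirit. The sketch below is an illustration under assumptions: it takes the field distribution of model B to be geometric with mean q, and uses the closed form that follows from combining it with the Poisson series, namely exp(-alpha q/(1+q)) (q/(1+q))^m K_m(lambda) with lambda = alpha/(1+q). Function names and parameter values are hypothetical.

```python
import math


def k_coeffs_model_b(m):
    """Coefficients of K_m for model B, l = 0..m, from the recursion
    f(l, m) = f(l-1, m-1) / l + f(l, m-1), starting from f(0, 0) = 1."""
    row = [1.0]
    for k in range(1, m + 1):
        new = [0.0] * (k + 1)
        for l in range(1, k + 1):
            same_l = row[l] if l < len(row) else 0.0
            new[l] = row[l - 1] / l + same_l
        row = new
    return row


def c_m_model_b(m, alpha, q):
    """Closed form assumed for the exponential (geometric) field model."""
    lam = alpha / (1.0 + q)
    k_m = sum(f * lam ** l for l, f in enumerate(k_coeffs_model_b(m)))
    return math.exp(-alpha * q / (1.0 + q)) * (q / (1.0 + q)) ** m * k_m


def c_m_model_c(m, alpha):
    """Single-field units: a plain Poisson distribution in m."""
    return math.exp(-alpha) * alpha ** m / math.factorial(m)


def c_m_direct_b(m, alpha, q, gamma_max=100):
    """Direct sum: gamma units, each with a geometric number of fields."""
    total = 0.0
    p_gamma = math.exp(-alpha)
    p_zero = 1.0 / (1.0 + q)            # probability of zero fields
    for g in range(gamma_max):
        if g == 0:
            p_m = 1.0 if m == 0 else 0.0
        else:
            # sum of g geometric counts follows a negative binomial
            p_m = math.comb(m + g - 1, m) * p_zero ** g * (1.0 - p_zero) ** m
        total += p_gamma * p_m
        p_gamma *= alpha / (g + 1)
    return total
```

Taking q very small while scaling alpha = mu / q, `c_m_model_b(m, mu / q, q)` approaches `c_m_model_c(m, mu)`, which is the comparison to model C described above.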



C. Sparsity and Threshold 

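The analytical relation between the threshold T of CA3
units and the sparsity a of the layer is obtained starting
from the formula defining the sparsity,

$$a_{CA3} = \frac{\langle \eta_i(x) \rangle^2}{\langle \eta_i^2(x) \rangle} = \frac{\left\langle \sigma(\rho(x)) + \rho(x)\, \Phi(\rho(x)) \right\rangle^2}{\left\langle \rho(x)\, \sigma(\rho(x)) + \Phi(\rho(x)) \left( 1 + \rho^2(x) \right) \right\rangle} \qquad (26)$$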

Since in the analytical calculation we have T as parame- 
ter, this equation can be taken as a relation a(T) which 



FIG. 11: Sparsity-Threshold relation. The sparsity a of
CA3 layer vs. the threshold T of CA3 units, from the
numerical integration of Eq. 26. Different lines correspond to
different degrees of connectivity between DG and CA3
(legend values: c = 0.058, c = 0.034, c = 0.02).



has to be inverted to allow a comparison with the simulations,
which are run controlling the sparsity level at a
predefined level (in our case a = 0.1) and adjusting the
threshold parameter accordingly. The inversion requires
using the m-field decomposition and numerical integration.
A graphical example of the numerical relation is
given in Fig. 11.
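On the simulation side, adjusting the threshold to reach a predefined sparsity can be sketched as a bisection on sampled inputs. This is a minimal illustration, assuming threshold-linear units and assuming that the sparsity decreases monotonically as the threshold is raised; the function names and the tolerance are choices made for this example.

```python
def sparsity(rates):
    """a = <eta>^2 / <eta^2>, taken to be 0 for an all-silent sample."""
    n = len(rates)
    mean = sum(rates) / n
    mean_sq = sum(r * r for r in rates) / n
    return mean * mean / mean_sq if mean_sq > 0 else 0.0


def threshold_for_sparsity(inputs, target, tol=1e-6):
    """Bisect on T so that the rates [h - T]^+ reach the target sparsity.

    inputs : sampled total inputs h (signal plus noise) across units
             and positions
    target : required sparsity, e.g. 0.1
    """
    lo, hi = min(inputs) - 1.0, max(inputs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        a = sparsity([max(0.0, h - mid) for h in inputs])
        if a > target:
            lo = mid        # activity too dense: raise the threshold
        else:
            hi = mid        # activity too sparse: lower the threshold
    return 0.5 * (lo + hi)
```

For inputs spread uniformly between 0 and 1, for instance, a target of 0.1 is reached with a threshold close to 0.87, since in that case the sparsity equals three quarters of the fraction of inputs above threshold.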



D. Simulations 



The mathematical model described above was simulated with a network of
500 DG cells and 500 CA3 cells. A virtual rat explores a continuous
two-dimensional space, intended to represent a 1 sqm square environment
but realized as a torus, with periodic boundary conditions. For
numerical purposes (estimating mutual information), the environment is
discretized in a grid of 20 x 20 locations, whereas trajectories are in
continuous space, but in discretized time steps. In each time step
(intended to correspond to roughly 62.5 ms, half a theta cycle), the
virtual rat moves half a grid unit (2.5 cm) in a direction similar to the
direction of the previous time step, with a small amount of noise. To
allow construction of a full localization matrix with good statistics,
simulations are typically run for 400,000 time steps (while for the
simplified translationally invariant matrix 5,000 steps would be
sufficient). The space has periodic boundary conditions, as in a torus,
to avoid border effects; the longest possible distance between any two
locations is hence equal to 14.1 grid units, or 70 cm.
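The geometry of the virtual environment can be sketched as follows. The grid size and step length are those given in the text; the function names and the amount of directional noise (`turn_sd`, in radians) are hypothetical, since the noise level is not specified here.

```python
import math
import random

SIDE = 20      # grid units per side (20 x 20 bins of 5 cm each)
STEP = 0.5     # half a grid unit (2.5 cm) per time step


def torus_distance(p, q, side=SIDE):
    """Shortest distance between two points with periodic boundaries."""
    dx = abs(p[0] - q[0]) % side
    dy = abs(p[1] - q[1]) % side
    dx, dy = min(dx, side - dx), min(dy, side - dy)
    return math.hypot(dx, dy)


def random_walk(n_steps, rng, turn_sd=0.3):
    """Trajectory whose heading changes slightly at each time step.

    rng     : a random.Random instance
    turn_sd : standard deviation of the change in heading (assumed value)
    """
    x, y = rng.uniform(0, SIDE), rng.uniform(0, SIDE)
    heading = rng.uniform(0, 2 * math.pi)
    path = []
    for _ in range(n_steps):
        heading += rng.gauss(0.0, turn_sd)
        x = (x + STEP * math.cos(heading)) % SIDE   # wrap around the torus
        y = (y + STEP * math.sin(heading)) % SIDE
        path.append((x, y))
    return path
```

The largest value `torus_distance` can return is the half-diagonal of the square, about 14.1 grid units, in agreement with the figure quoted above.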
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1. DG place fields 

After assigning a number of firing fields to each DG unit, according to
the distributions of models A, B and C, we assign to each field a
randomly chosen center. The shape of the field is then given by a
Gaussian bell with that center. The tails of the Gaussian function are
truncated: the value of the function is set to zero when the distance
from the center is larger than a fixed radius $r = \sqrt{fA/\pi}$, with
f = 0.1 the ratio between the area of the field and the environment
area A. In the standard model, only about 3 percent of the DG units are
active in the environment, in agreement with experimental findings 0;
i.e. the DG firing probability is $p_{DG} = 0.033$.
The firing of DG units is not affected by noise, nor by 
any further threshold. Peak firing is conventionally set, 

2 

in the center of the field, at the value = 2.02, but DG 
units can fire at higher levels if they are assigned two or 
more overlapping fields. 
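The firing of a DG unit with one or more truncated Gaussian fields can be sketched as below. The peak value and the area ratio f follow the text; the truncation radius is computed on the assumption that f is the ratio of the field area to the environment area, and the width `sigma` of the Gaussian is a hypothetical value, since it is not given in this section.

```python
import math


def dg_rate(pos, centers, side=20.0, f=0.1, sigma=1.5, peak=2.02):
    """Firing rate of a DG unit at position pos on the torus.

    centers : list of (x, y) field centers assigned to the unit
    side    : side of the environment, in grid units
    f       : ratio between field area and environment area
    sigma   : width of the Gaussian bell (assumed value)
    peak    : firing rate at the center of a single field
    """
    radius = math.sqrt(f * side * side / math.pi)   # pi r^2 = f A
    rate = 0.0
    for cx, cy in centers:
        dx = abs(pos[0] - cx) % side
        dy = abs(pos[1] - cy) % side
        dx, dy = min(dx, side - dx), min(dy, side - dy)
        dist = math.hypot(dx, dy)
        if dist <= radius:                          # truncated tails
            rate += peak * math.exp(-dist * dist / (2.0 * sigma * sigma))
    return rate
```

Because the contributions of different fields add up, a unit with two overlapping fields can exceed the single-field peak, as noted above.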

2. CA3 activation 

CA3 units fire according to Eq. 2: the firing of a CA3 unit is a linear
function of the total incoming DG input, distorted by a noise term. This
term is taken from a Gaussian distribution centered on zero, with unit
variance ($\delta = 1$), and it changes for each unit and each time step.
A threshold is imposed in the simulations to model the action of
inhibition, hypothesizing that it serves to adjust the sparsity a of CA3
activity to its required value. The sparsity is defined as

$$a = \frac{\langle \eta_i(x) \rangle^2}{\langle \eta_i^2(x) \rangle}$$

and it is fixed to a = 0.1. This implies that the activity of the CA3
cell population is under tight inhibitory control.

3. The decoding procedure and information extraction 

At each time step, the firing vector of a set of CA3 units is compared to
all the average vectors recorded at each position in the 20 x 20 grid,
for the same sample, in a test trial (these are called template vectors).
The comparison is made by calculating the Euclidean distance between the
current vector and each template, and the position of the closest
template is taken to be the decoded position at that time step, for that
sample. This procedure has been termed maximum likelihood Euclidean
distance decoding [12]. The frequencies of each pair of decoded and real
positions are compiled in a so-called "confusion matrix", or localization
matrix, that reflects the ensemble of conditional probabilities
$\{P(\{\eta_i\}|x)\}$ for that set of units. Should decoding "work" in a
perfect manner, in the sense of always detecting the correct position in
space of the virtual rat, the confusion matrix would be the identity
matrix. From the confusion matrix obtained at the end of the simulation,
the amount of information is extracted, and plotted versus the number of
CA3 units present in the set. We averaged extensively over CA3 samples,
as there are large fluctuations from sample to sample, i.e. for each
given number of CA3 units we randomly picked several different groups of
CA3 units and then averaged the mutual information values obtained. In
all the results reported we averaged also over 3-4 simulation runs, each
with a different random number generator seed, i.e. over different
trajectories. The same procedure leading to the information curve was
repeated for different values of the parameters. In all the information
measures we reported, we also corrected for the limited sampling bias, as
discussed by [44]. In our case of spatial information, the bias is
essentially determined by the spatial binning we used (20 x 20) and by
the decoding method [32].
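The decoding step and the extraction of information from the confusion matrix can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, the information is the plain plug-in estimate in bits, and no correction for the limited sampling bias is included.

```python
import math


def decode(vector, templates):
    """Index of the template closest, in Euclidean distance, to vector."""
    best, best_d = 0, float("inf")
    for k, template in enumerate(templates):
        d = sum((v - u) ** 2 for v, u in zip(vector, template))
        if d < best_d:
            best, best_d = k, d
    return best


def confusion_matrix(actual_bins, vectors, templates):
    """counts[x][xp]: times the rat was in bin x and decoded in bin xp."""
    n = len(templates)
    counts = [[0] * n for _ in range(n)]
    for x, v in zip(actual_bins, vectors):
        counts[x][decode(v, templates)] += 1
    return counts


def mutual_information(counts):
    """Plug-in mutual information (bits) between actual and decoded bins."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    info = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c:
                info += c / total * math.log2(c * total / (row_sums[i] * col_sums[j]))
    return info
```

With perfect decoding over equally visited bins the estimate equals the logarithm (base 2) of the number of bins, the upper bound for the chosen binning.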

4. Fitting

We fit the information curves obtained in simulations as a function of N
in order to get the values of the two most relevant parameters that
describe their shape: the initial slope $I_1$ (i.e. the average
information conveyed by the activity of individual units) and the total
amount of information $I_\infty$ (i.e. the asymptotic saturation value).
The function we used for the fit is the following

$$F(N) = I_\infty \left( 1 - e^{-N I_1 / I_\infty} \right) \qquad (27)$$

In most cases the fit was in excellent agreement with individual data
points, as expected on the basis of previous analyses [37].
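A fit of this kind can be sketched without external libraries. The example assumes an exponentially saturating curve with initial slope I_1 and asymptote I_inf, and uses a simple strategy chosen for this illustration: scanning the rate k = I_1 / I_inf on a logarithmic grid and solving for the best asymptote in closed form at each value. It is not the fitting routine used for the results reported here.

```python
import math


def fit_saturating(ns, infos, k_grid=None):
    """Least-squares fit of F(N) = I_inf * (1 - exp(-N * I_1 / I_inf)).

    ns     : numbers of units in the samples
    infos  : information values measured for those samples
    k_grid : candidate values of k = I_1 / I_inf (default: 1e-4 to 10)
    Returns the pair (I_1, I_inf).
    """
    if k_grid is None:
        k_grid = [10 ** (e / 200.0) for e in range(-800, 201)]
    best = None
    for k in k_grid:
        g = [1.0 - math.exp(-k * n) for n in ns]
        # for fixed k the optimal I_inf is a linear least-squares solution
        i_inf = sum(y * gi for y, gi in zip(infos, g)) / sum(gi * gi for gi in g)
        err = sum((y - i_inf * gi) ** 2 for y, gi in zip(infos, g))
        if best is None or err < best[0]:
            best = (err, k * i_inf, i_inf)
    _, i_1, i_inf = best
    return i_1, i_inf
```

The slope returned is the derivative of the fitted curve at N = 0, so it can be compared directly with measures of the information per unit.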
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